When training a manipulator path planning algorithm, the huge action and state spaces lead to sparse rewards, which lowers training efficiency and makes it difficult to evaluate the value of states and actions. To address these problems, a robotic manipulator path planning algorithm based on Soft Actor-Critic (SAC) reinforcement learning was proposed. Learning efficiency was improved by incorporating a demonstrated path into the reward function, so that the manipulator imitated the demonstrated path during reinforcement learning, and the SAC algorithm was used to make training faster and more stable. The proposed algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm were each used to plan 10 paths; the average distances between the planned paths and the reference paths were 0.8 cm for the proposed algorithm and 1.9 cm for DDPG. The experimental results show that the path imitation mechanism improves training efficiency, and that the proposed algorithm explores the environment better and plans more reasonable paths than the DDPG algorithm.
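The idea of folding a demonstrated path into the reward can be illustrated with a minimal sketch. The abstract does not give the reward formula, so the function names, the weight `imitation_weight`, the goal bonus, and the waypoint-based distance below are all assumptions chosen for illustration, not the authors' definition.

```python
import math

def distance_to_path(point, demo_path):
    """Smallest Euclidean distance from `point` to any waypoint of `demo_path`."""
    return min(math.dist(point, waypoint) for waypoint in demo_path)

def shaped_reward(position, goal, demo_path,
                  imitation_weight=0.5, goal_tolerance=0.01):
    """Hypothetical dense reward: approach the goal while staying near the demo path.

    All terms and constants are illustrative assumptions.
    """
    goal_distance = math.dist(position, goal)
    goal_term = -goal_distance                      # closer to the goal is better
    imitation_term = -imitation_weight * distance_to_path(position, demo_path)
    bonus = 1.0 if goal_distance <= goal_tolerance else 0.0
    return goal_term + imitation_term + bonus

# Example: a straight demonstrated path along the x axis (units are arbitrary).
demo_path = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
goal = (1.0, 0.0, 0.0)
on_path = shaped_reward((0.5, 0.0, 0.0), goal, demo_path)
off_path = shaped_reward((0.5, 0.2, 0.0), goal, demo_path)
at_goal = shaped_reward(goal, goal, demo_path)
```

In this sketch an end-effector position on the demonstrated path scores higher than one that has drifted away from it, which gives the agent a dense signal even when the goal reward itself is sparse.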
Highly accurate resource load prediction can provide a basis for real-time task scheduling and thus reduce energy consumption. However, most time-series prediction models for resource load make short-term or long-term predictions by extracting only the long-term dependence features of the series and neglect its short-term dependence features. To improve long-term prediction of resource load, a new edge computing resource load prediction model based on long-term and short-term time-series feature fusion was proposed. Firstly, the Gramian Angular Field (GAF) was used to transform the time series into image-format data so that features could be extracted by a Convolutional Neural Network (CNN). Then, the CNN was used to extract spatial features and short-term data features, and a Long Short-Term Memory (LSTM) network was used to extract the long-term dependence features of the time series. Finally, the extracted long-term and short-term dependence features were fused through a dual-channel structure to realize long-term resource load prediction. Experimental results show that the Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and R-squared (R²) of the proposed model for CPU resource load prediction on the Alibaba Cloud cluster trace dataset are 3.823, 5.274, and 0.8158 respectively. Compared with the single-channel CNN and LSTM models, the dual-channel CNN+LSTM and ConvLSTM+LSTM models, and resource load prediction models such as LSTM Encoder-Decoder (LSTM-ED) and XGBoost, the proposed model provides higher prediction accuracy.
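The GAF step can be sketched in a few lines. The abstract does not say which GAF variant or scaling was used, so the version below assumes the common summation form: the series is rescaled to [-1, 1], each value is mapped to an angle with arccos, and entry (i, j) of the image is cos(angle_i + angle_j). The function name is hypothetical.

```python
import math

def gramian_angular_field(series):
    """Return the Gramian Angular Summation Field of a 1-D series as a list of rows.

    Assumes min-max scaling to [-1, 1]; the paper's exact preprocessing is not stated.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]
    # Clamp against floating-point drift before taking arccos.
    angles = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]

# Example: a short, hypothetical CPU-load window (percent).
image = gramian_angular_field([10.0, 20.0, 30.0])
```

A series of length n becomes an n × n symmetric image, which is what allows a 2-D CNN to be applied to the load data.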
To address the complexity and long latency of division, a design method for a high-performance double-precision floating-point division unit based on Goldschmidt's algorithm and supporting the IEEE-754 standard was proposed. Firstly, the computation of division with Goldschmidt's algorithm and the error produced during iteration were analyzed, and a method for controlling that error was proposed. Secondly, bipartite reciprocal tables were adopted to compute the initial iteration value while saving area, and parallel multipliers were used in the iteration unit for acceleration. Lastly, the execution stages were divided appropriately so that the floating-point divider supports pipelined execution under state-machine control, which improves the speed of the divider. The experimental results show that, for the pipelined double-precision floating-point divider with a 14-bit initial iteration value, the synthesized cell area is 84902.2618 μm² and the operating frequency reaches 2.2 GHz in 40 nm technology. Compared with the pipelined structure using an 8-bit initial iteration value, computing speed is increased by 32.73% and area is increased by 5.05%. The latency of a double-precision floating-point division instruction is 12 cycles, and it is decreased to 3 cycles in pipelined execution. Compared with dividers based on the SRT algorithm implemented in other processors, data throughput is improved by 3-7 times. Compared with dividers based on Goldschmidt's algorithm implemented in other processors, data throughput is improved by 2-3 times.
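The Goldschmidt iteration itself is easy to show in software. The sketch below is only an illustration of the arithmetic, not the hardware design: the bipartite reciprocal table is replaced by a reciprocal rounded to `seed_bits` fractional bits, ordinary Python floats stand in for the datapath, and special values (infinities, NaN, subnormals) are not handled. The function name and parameters are hypothetical.

```python
import math

def goldschmidt_divide(numerator, denominator, seed_bits=14, iterations=2):
    """Approximate numerator / denominator with Goldschmidt's iteration.

    Both operands are repeatedly multiplied by f = 2 - r, which drives the
    scaled denominator r toward 1 and the scaled numerator q toward the quotient.
    """
    if denominator == 0:
        raise ZeroDivisionError("division by zero")
    sign = -1.0 if (numerator < 0) != (denominator < 0) else 1.0
    n, d = abs(numerator), abs(denominator)

    # d = m * 2**e with m in [0.5, 1); only the mantissa needs a reciprocal.
    m, e = math.frexp(d)

    # Stand-in for the lookup table: 1/m rounded to seed_bits fractional bits.
    scale = 1 << seed_bits
    f = round(scale / m) / scale

    q = n * f          # scaled numerator
    r = m * f          # scaled denominator, already close to 1
    for _ in range(iterations):
        f = 2.0 - r    # correction factor
        q *= f
        r *= f         # error in r is squared on every pass
    return sign * math.ldexp(q, -e)   # undo the exponent scaling

third = goldschmidt_divide(1.0, 3.0)
negative = goldschmidt_divide(-7.5, 2.5)
```

Because the error is squared on each pass, a seed accurate to about 14 bits reaches roughly 28 and then 56 bits after two passes, which is why a more precise initial value reduces the number of multiplications needed for double precision.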
Traditional player skill estimation algorithms based on probabilistic graphical models neglect the first-move advantage (or home advantage), which reduces estimation accuracy. To address this problem, a new method for modeling the first-move advantage was proposed. Based on the graphical model, nodes representing the first-move advantage were introduced and added to the players' skills. Then, according to the game results, the true skills and first-move advantages of the players were calculated by a Bayesian learning method. Finally, predictions for upcoming matches were made using the estimated results. Two real-world datasets were used to compare the proposed method with the traditional model that neglects the first-move advantage. The results show that the proposed method noticeably improves average estimation accuracy.
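How an advantage term enters a match prediction can be sketched as follows. The abstract does not specify the likelihood, so this assumes a Gaussian performance model in the style of common skill-rating systems, uses point estimates rather than full Bayesian posteriors, and treats `performance_std` as a hypothetical noise parameter.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def first_mover_win_probability(skill_first, skill_second,
                                first_move_advantage, performance_std=1.0):
    """Probability that the first mover wins under an assumed Gaussian model.

    The advantage is simply added to the first mover's skill before comparison.
    """
    difference = skill_first + first_move_advantage - skill_second
    # Each player's performance has variance performance_std**2.
    return normal_cdf(difference / (math.sqrt(2.0) * performance_std))

baseline = first_mover_win_probability(25.0, 25.0, 0.0)
with_advantage = first_mover_win_probability(25.0, 25.0, 0.3)
```

With equal skills and no advantage the prediction is an even match, and any positive advantage shifts the probability toward the first mover; a model that ignores the term would attribute that shift to skill instead.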